Characterizing Model Errors and Differences
Abstract
A critical component of applying machine learning algorithms is evaluating the performance of the induced models and using that evaluation to guide further development. Traditionally, the most common evaluation metric has been error or loss; however, a single number of this kind gives the designer very little information to use when constructing a system. We argue that an evaluation method should provide detailed feedback on the performance of an algorithm and that this feedback should be in the language of the problem: our goal is to characterize model errors, or the differences between models, in the feature space. We provide a framework for this that allows different algorithms to be used as the discovery engine, and we consider two approaches: (1) a classification strategy, where we use a standard rule learner such as C5; and (2) a descriptive paradigm, where we use a new discovery algorithm, a contrast set miner. We show that C5 suffers from several problems that make it unsuitable for this task.
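To make the idea of describing errors "in the feature space" concrete, the following is a minimal sketch, not the contrast set miner described in the abstract. It assumes categorical features, splits examples into misclassified and correctly classified groups, and ranks each single feature-value pair by how much its support differs between the two groups. The function name `support_differences` and the toy data are hypothetical and chosen only for illustration.

```python
from collections import Counter


def support_differences(rows, is_error):
    """Rank (feature, value) pairs by the difference between their support
    among misclassified rows and among correctly classified rows.

    This is a simplified, single-attribute illustration; it performs no
    significance testing and does not search over attribute conjunctions.
    """
    err_counts, ok_counts = Counter(), Counter()
    n_err = sum(is_error)
    n_ok = len(rows) - n_err

    # Count how often each feature-value pair occurs in each group.
    for row, err in zip(rows, is_error):
        target = err_counts if err else ok_counts
        for feature, value in row.items():
            target[(feature, value)] += 1

    results = []
    for pair in set(err_counts) | set(ok_counts):
        s_err = err_counts[pair] / n_err if n_err else 0.0
        s_ok = ok_counts[pair] / n_ok if n_ok else 0.0
        results.append((pair, s_err, s_ok, s_err - s_ok))

    # Largest absolute difference first; repr() gives a deterministic tie-break.
    results.sort(key=lambda r: (-abs(r[3]), repr(r[0])))
    return results


# Hypothetical toy data: the model's two errors both fall on color == "red".
rows = [
    {"color": "red", "size": "small"},
    {"color": "red", "size": "large"},
    {"color": "blue", "size": "small"},
    {"color": "blue", "size": "large"},
    {"color": "red", "size": "small"},
    {"color": "blue", "size": "large"},
]
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [0, 0, 0, 0, 1, 0]
is_error = [t != p for t, p in zip(y_true, y_pred)]

top = support_differences(rows, is_error)
for (feature, value), s_err, s_ok, diff in top:
    print(f"{feature}={value}: errors={s_err:.2f} correct={s_ok:.2f} diff={diff:+.2f}")
```

In this toy data the `color` pairs show the largest support difference while the `size` pairs show none, which is the kind of feature-level description that an aggregate error rate cannot give. The same grouping can be used to compare two models by replacing `is_error` with a flag marking where their predictions disagree.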